2025-10-06 Return Oriented Programming

When calling a function the stack has:

  • return address (Saved Instruction Pointer)
  • etc

Stack addresses are somewhat easy to guess, so if we overwrite the saved return address (e.g. via a buffer overflow) and arrange for some custom code to be at the new return location, we can trick the program into running it for us.
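A toy model of that overwrite, as a rough sketch rather than real memory: the frame is just a byte array with a buffer followed by the saved return address. The sizes, addresses and function names here are all made up for illustration.

```python
# Toy stack frame: [ buffer | saved return address ] in one contiguous region.
BUF_SIZE = 8    # hypothetical buffer size in bytes
ADDR_SIZE = 8   # size of a 64-bit saved return address


def make_frame(return_address: int) -> bytearray:
    """Lay out an empty buffer followed by the saved return address."""
    return bytearray(BUF_SIZE) + bytearray(return_address.to_bytes(ADDR_SIZE, "little"))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the buffer with no bounds check."""
    frame[0:len(data)] = data


def saved_return_address(frame: bytearray) -> int:
    """Read back the address the function would return to."""
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + ADDR_SIZE], "little")


frame = make_frame(0x401136)  # made-up legitimate return address
# Fill the buffer, then keep writing: the extra bytes land on the return address.
overflow = b"A" * BUF_SIZE + (0xDEADBEEF).to_bytes(ADDR_SIZE, "little")
unchecked_copy(frame, overflow)
print(hex(saved_return_address(frame)))  # 0xdeadbeef
```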

Two Main Solutions

ASLR

Address Space Layout Randomisation: load the stack (and other regions) at a random address on each run, so it’s harder to guess where things are. But this is expensive.

W^X (write xor execute), or why are we running code off of the stack?

Program code shouldn’t exist on the stack: why are we running from there? We can implement this protection in hardware (fast!). We do this by marking each page of memory as either writable or executable, never both, which costs one extra bit per page. Program loading is a bit slower (two extra syscalls) but overall it is a security improvement.
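A minimal sketch of the rule, modelled in Python rather than real page tables. The `Page` class and its methods are hypothetical; `protect` stands in for a permission-changing call such as mprotect.

```python
from enum import Flag, auto


class Perm(Flag):
    R = auto()
    W = auto()
    X = auto()


def check_wx(perms: Perm) -> None:
    """Refuse any permission set that is both writable and executable."""
    if Perm.W in perms and Perm.X in perms:
        raise ValueError("W^X violation: page cannot be writable and executable")


class Page:
    """Hypothetical model of one memory page with permission bits."""

    def __init__(self, perms: Perm):
        check_wx(perms)
        self.perms = perms
        self.data = bytearray(4096)

    def protect(self, perms: Perm) -> None:
        """Change permissions (the step a JIT has to repeat for each block)."""
        check_wx(perms)
        self.perms = perms

    def write(self, offset: int, payload: bytes) -> None:
        if Perm.W not in self.perms:
            raise PermissionError("page is not writable")
        self.data[offset:offset + len(payload)] = payload

    def execute(self) -> str:
        if Perm.X not in self.perms:
            raise PermissionError("page is not executable")
        return "executing"


stack = Page(Perm.R | Perm.W)
stack.write(0, b"\x90\x90")      # attacker can still write to the stack...
try:
    stack.execute()              # ...but jumping to it is refused
except PermissionError as err:
    print("blocked:", err)
```

The JIT case is the awkward one in this model: a block is written while the page is R|W, then `protect(Perm.R | Perm.X)` has to be called before it can run, and flipped back again for the next write.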

This suuucks for JIT programs, since these two extra syscalls have to be done for each block of JIT code, introducing a lot of overhead.

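This was bypassable without injecting any code: instead of returning to the stack, we return to the libc function system(), which runs a command in the shell (a return to libc attack). With no randomisation its address is fixed, and the argument is easy to come by too, since the string “/bin/sh” already exists in libc’s memory.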

How do we fix this? ASLR is good, but we don’t really have enough bits on 32-bit systems. Fortunately, around this time we moved to 64-bit, which meant loads more space for ASLR. Also, most 64-bit calling conventions pass arguments via registers instead of the stack, which makes return to libc harder: smashing the stack no longer sets the function’s arguments directly.
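To put rough numbers on “not enough bits”: if the layout has n bits of entropy and is re-randomised on every attempt, a blind guess succeeds with probability 1/2^n, so the expected number of attempts is 2^n. The bit counts below are illustrative assumptions, not figures for any particular OS.

```python
def expected_attempts(entropy_bits: int) -> int:
    """Mean number of blind guesses when each guess succeeds with p = 1 / 2**bits."""
    return 2 ** entropy_bits


SMALL = 16   # assumed entropy for a cramped 32-bit address space
LARGE = 28   # assumed entropy for a roomy 64-bit address space

print(expected_attempts(SMALL))                              # 65536
print(expected_attempts(LARGE))                              # 268435456
print(expected_attempts(LARGE) // expected_attempts(SMALL))  # 4096 times more work
```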

Buuut randomness is still expensive, especially as we get more and more symbols. So instead of loading each function into a random location, let’s just load each library at a random offset. This is obviously way faster, but if a single pointer into the library is leaked, then every address in that library is leaked, since the offsets within the library are fixed.
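The arithmetic behind “one leak gives you everything” is just a subtraction and an addition. A sketch, where the symbol offsets are invented for illustration and not taken from any real library build:

```python
import random

# Hypothetical offsets of symbols inside the library; fixed when the library is built.
OFFSETS = {"puts": 0x80E50, "system": 0x50D70, "bin_sh": 0x1D8678}


def library_base(leaked_address: int, symbol: str) -> int:
    """Recover the random load address from one leaked pointer to a known symbol."""
    return leaked_address - OFFSETS[symbol]


def resolve(base: int, symbol: str) -> int:
    """Compute the run-time address of any other symbol in the same library."""
    return base + OFFSETS[symbol]


# The loader picks a random page-aligned base that the attacker does not know.
secret_base = random.randrange(1, 2 ** 28) << 12

# The attacker only sees one leaked pointer, e.g. the address of puts.
leaked_puts = secret_base + OFFSETS["puts"]

recovered = library_base(leaked_puts, "puts")
print(recovered == secret_base)             # True
print(hex(resolve(recovered, "system")))    # address of system() for this run
```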

Now, using our return to libc attack is a bit harder:

  • Find a buffer overflow
  • Break ASLR by leaking a pointer to a library
  • Return to main to restart the program without re-randomising
  • Re-exploit buffer overflow to jump to the library function you now know the address of.

We can do better with:

Return Oriented Programming

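A gadget is the short run of instructions immediately before a return (ret) instruction, typically 1 or 2 instructions. Because each gadget ends in a return, putting a list of gadget addresses on the stack runs them one after another. If we can construct a basic language from a series of gadgets, we can write any program with a series of these gadgets. ROP is generally the state-of-the-art technique for hijacking execution, e.g. for jailbreaks.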
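A toy interpreter makes the chaining idea concrete. This is a sketch of the concept only: the gadget addresses, the two-register machine and `run_chain` are all invented, and each “ret” is modelled as popping the next address off the attacker-controlled stack.

```python
def pop_rdi(regs, pop):
    regs["rdi"] = pop()            # models: pop rdi ; ret


def pop_rax(regs, pop):
    regs["rax"] = pop()            # models: pop rax ; ret


def add_rax_rdi(regs, pop):
    regs["rax"] += regs["rdi"]     # models: add rax, rdi ; ret


# Hypothetical addresses where these gadgets were found in existing code.
GADGETS = {0x1000: pop_rdi, 0x2000: pop_rax, 0x3000: add_rax_rdi}


def run_chain(stack, gadgets):
    """Execute a chain: every 'ret' pops the next gadget address from the stack."""
    regs = {"rax": 0, "rdi": 0}
    words = list(stack)
    sp = 0  # index of the current top of stack

    def pop():
        nonlocal sp
        value = words[sp]
        sp += 1
        return value

    while sp < len(words):
        address = pop()
        if address not in gadgets:
            raise ValueError(f"jumped to non-gadget address {address:#x}")
        gadgets[address](regs, pop)
    return regs


# The "program" is nothing but addresses and data laid out on the stack.
chain = [0x1000, 40, 0x2000, 2, 0x3000]
print(run_chain(chain, GADGETS)["rax"])  # 42
```

Note that no new code is written anywhere: the chain only reuses instructions that already exist, which is why W^X does not stop it.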

Stopping ROP

We can’t, at least not trivially. ROP falls out of fundamental decisions about how computers were architected that we made back in the 60s.

There are techniques we can use to make it harder though:

  • Shadow stacks: keep a second stack for stack consistency checking and have the kernel kill the program if it ever gets out of sync (checked after each return).
  • Full ASLR: really randomise everything.
  • Instruction Pointer Integrity Protections: on return, check that our instruction pointer only goes to whitelisted addresses.
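The shadow stack idea can be sketched in a few lines. This is a model under simplifying assumptions: the `Machine` class is hypothetical, the shadow copy is assumed to be out of the attacker’s reach, and raising an exception stands in for the kernel killing the program.

```python
class ShadowStackViolation(Exception):
    """Raised when the two stacks disagree; stands in for the process being killed."""


class Machine:
    def __init__(self):
        self.stack = []    # ordinary stack: attacker-writable in this model
        self.shadow = []   # protected copy of return addresses only

    def call(self, return_address: int) -> None:
        """On call, record the return address on both stacks."""
        self.stack.append(return_address)
        self.shadow.append(return_address)

    def ret(self) -> int:
        """On return, compare both copies before trusting the address."""
        actual = self.stack.pop()
        expected = self.shadow.pop()
        if actual != expected:
            raise ShadowStackViolation(
                f"return to {actual:#x}, expected {expected:#x}"
            )
        return actual


m = Machine()
m.call(0x401136)
m.stack[-1] = 0xDEADBEEF   # simulated overflow rewrites the ordinary stack only
try:
    m.ret()
except ShadowStackViolation as err:
    print("killed:", err)
```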

This is only an issue due to buffer overflows. So you could always use a language without them (Rust).

Better explanation of Heap overflows

https://cs-uob.github.io/COMSM0049/lectures/3/slides.pdf page 63